Abstract

This paper develops strategic foundations for an important statistical model of
random networks with heterogeneous expected degrees. Based on this, we show how social
networking services that subtly alter the costs and indirect benefits of relationships can
cause large changes in behavior and welfare. In the model, agents who value friends
and friends of friends choose how much to socialize, which increases the probabilities
of links but is costly. There is a sharp transition from fragmented, sparse
equilibrium networks to connected, dense ones when the value of friends of friends crosses a
cost-dependent threshold. This transition mitigates an extreme inefficiency

New Internet services that help people to track, maintain, and create social relationships
have attracted hundreds of millions of users over the past decade. It has been claimed
that these innovations (websites such as Facebook, MySpace, LinkedIn, and Twitter)
fundamentally change the network formation dynamics of the groups that use them. For
example, a 2007 New York Times article stated that "Facebook and other social networks
like MySpace have transformed the social lives of teenagers in many ways, and that includes
how they make the transition from high school to college" (Lombardi, 2007). Why did this
technology, which was an incremental advance in electronic communication, lead to such
apparently large changes in social interaction? We develop a theory to address that puzzle,
which explains how small changes in the costs and benefits of direct and indirect relationships
can lead to large changes in equilibrium networks.

The model that we develop for this purpose connects the economic theory of rational
network formation with the random graphs that have become workhorses for modeling
social networks in statistics, physics, and computer science, namely those of Chung and Lu
(2002), built on the seminal work of Erdős and Rényi (1959). While this powerful statistical
model can fit a wide variety of empirically important networks with arbitrary degree
distributions,¹ it lacks rational foundations and an understanding of how linking probabilities
relate to more fundamental parameters, such as costs and benefits. These are important
when the agents who populate the networks have some control over their interaction.
Indeed, as emphasized by Jackson (2008) and König, Tessone, and Zenou (2009), the random
graphs literature has quite successfully addressed "how" links seem to form, but has less to 
say about "why" and about where the parameters in the models come from. To provide 
rational foundations for these models, and to address our motivating puzzle, we focus on a 
simple environment. Agents, such as an entering cohort of students at a business school, 
meet each other for the first time and socialize now to create relationships that may be 
realized in the future. Those realizations are uncertain, but their probabilities increase with 
investment, as agents socialize more within the group. Agents value friends and also friends 
of friends. They trade off the expected direct and indirect benefits of links against the costs 
of socializing, which are agent-specific private information. This model is flexible enough 
to produce networks with arbitrary degree distributions and clustering, and it yields sharp 
predictions about the distributions of important network statistics such as connectedness, 
diameter, and density as functions of the key economic parameters. As a result, it is suitable
for structural modeling.

1 The degree of an agent is the number of connections he has. Power law distributions, in which the
proportion of nodes with degree d falls off as a power of d, have been observed in many applications
(Newman 2003). As emphasized by Chung, Lu, and Vu (2004), the Chung-Lu framework gives a tractable
probabilistic model of such networks.

Table 1: A summary of the properties of the two equilibrium regimes; n is the population
size. The equilibrium falls into the low-intensity regime if the value of friends of friends is
below the critical threshold $\tau_{eq}$, and into the high-intensity regime otherwise.

This approach goes beyond giving economic interpretations to the parameters of an
important statistical model. The modeling reveals that there are surprising and highly
non-linear relationships between the economic fundamentals and the properties of the realized
networks due to the ways agents best-respond to each other. These results are used to give 
one explanation of the Facebook puzzle. 

The first main result is that equilibrium networks fall into two regimes, depending on the 
parameter values: a connected, high-intensity regime, and a fragmented, low-intensity one. 
These regimes are extremely different, and which one is relevant depends on the comparison
of the value of friends of friends (an exogenous parameter called $v_2$) to a threshold called $\tau_{eq}$
that is computed based on costs. The properties of the regimes are summarized in Table 1
and some illustrative examples are shown in Figure 1. When friends of friends are sufficiently
valuable, with their value exceeding a cost-dependent threshold, agents in equilibrium devote
a lot of time to socializing, and the expected number of friends each has scales as the square
root of the population size. As already implied by the name, the networks in this regime
are connected with very high probability as the population grows large; indeed, there is
a path of length at most three between any two agents. In contrast, when the value of
friends of friends falls just slightly below the threshold, agents socialize significantly less, 
and the resulting networks consist of many disconnected pieces. The expected number of 
friends per agent tends to a constant as the network grows large. A striking fact is that 
the value of one's own friends affects neither the location of the transition $\tau_{eq}$ between these
equilibrium regimes nor the properties of the connected, high-intensity regime. It only affects
the equilibrium degree of each agent in the low-intensity regime.

Figure 1: Examples showing typical networks formed in equilibrium with n = 400 agents in
(a) the high-intensity regime and (b) the low-intensity regime. The high-intensity network
has a single component and many links per node, whereas the low-intensity network is highly
fragmented.

Table 2: Comparison between equilibrium and efficient networks, as well as how much of the
available welfare is lost. When $v_2 \in (\tau_{eff}, \tau_{eq})$, the ratio of available gains to realized gains is
unbounded, making this intermediate range extremely inefficient.
a Relative to equilibrium welfare, for fixed costs and values, as the population size grows.

The second main result focuses on efficiency. In the case where all agents have a known
cost parameter, so that the social planner faces no informational problems, we characterize
"efficient mingling": what uniform intensities of interaction a utilitarian social planner would
select. We find that these, too, exhibit a phase transition when the value of friends of friends
surpasses a key threshold $\tau_{eff}$, moving from low intensity to high intensity. However, this
threshold governing the jump in efficient levels of socializing is half the threshold
governing the jump in the equilibrium levels of socializing. That is, $\tau_{eff} = \frac{1}{2}\tau_{eq}$. Between
these two thresholds, when $\tau_{eff} < v_2 < \tau_{eq}$, we find that efficient levels of socializing exceed
equilibrium levels by a factor growing to infinity with the population size; the same is true
of agents' welfare. Outside of this area between the thresholds, equilibrium networks are
still inefficient, but only by a constant factor for arbitrary network size. That is, the area
between the thresholds is one of extreme inefficiency, where a vanishingly small fraction of
available benefits is being extracted. This is summarized in Table 2.

These results can be interpreted in the context of social networking technology. When 
a new technology comes along that increases the value of friends of friends, for example by 
exposing lists of friends of friends and information about them, agents' equilibrium socializing 
decisions can shift drastically if the starting point was near the critical threshold. The 
technological shift takes the network from a fragmented and extremely inefficient regime 
to a connected and much more efficient one. This effect is only amplified by the fact that 
new technologies arguably also reduce the costs of socializing. Thus, our analysis provides 
a potential underlying mechanism for the large qualitative impact that social networking 
services have been said to have on social life. 

The paper is organized as follows. In Section 1, we discuss how our approach relates to 
the economic literature on network formation. Next, in Section 2, we formally lay out the
model. Then we examine equilibrium and efficiency: first, in Section 3, when all agents
have the same costs, and then, in Section 4, when their costs are private independent draws
from a commonly known distribution. In Section 5, we discuss extensions of the results to
highlight their robustness and limitations. There, we (i) endogenize the value of friends of
friends through the mechanism of people introducing friends to each other; (ii) show that
the specification of the cost function is not driving the qualitative results; and (iii) explain why
the realized random networks are stable when agents have to pay some costs to maintain
links. Section 6 concludes.



1 Related Literature 



The importance of the basic problem of how social networks form has been widely recognized
in economics,² and the study of rational network formation has a rich history. One strand
of this literature, starting with Myerson (1991) and continuing with Jackson and Wolinsky
(1996), Bala and Goyal (2000), and Hojman and Szeidl (2008), among many others, has
studied the stability of certain networks to unilateral and bilateral deviations which translate
deterministically into changes in the network. The literature is surveyed extensively by
Jackson (2005) and Jackson (2008).



network insofar as that is important for their deviations, and delivers very specific and often 
stark predictions about network structure. While this has been an extremely important 
approach for understanding aspects of network formation, a different model is appropriate 
for the first-meeting setting that we focus on, as well as for generating the random graphs 
that are our equilibrium predictions. In our model, in contrast to these, any network has 
a positive probability of appearing in equilibrium, though some are much less likely than 
others; moreover, agents are fully aware of the randomness that generates this and take it 
into account when optimizing. This makes the present model a natural fit for structural 
estimation. 



Another strand of the literature, which includes König, Tessone, and Zenou (2009), has
modeled various dynamic processes of network evolution with rational choices; those models
often include a stochastic aspect in how decisions translate into outcomes. The networks
generated are thus random, but usually belong by construction to a specific sub-class of
networks. This approach is similar to ours, though we focus on a simple static framework.
The main distinction is that the random graphs that come out of our model are closer to
the standard ones used in the statistical modeling of social networks.

2 Social networks affect economic outcomes in a multitude of ways. They influence decisions and outcomes
relating to employment (Topa, 2001), investment (Duflo and Saez, 2003), risk-sharing (Ambrus, Möbius, and
Szeidl, 2010), education (Calvó-Armengol, Patacchini, and Zenou, 2009), and crime (Glaeser, Sacerdote, and
Scheinkman, 1996), to name just a few of their effects. See Granovetter (2005) for a broad survey of the
effects of social networks.

The analysis closest to ours is a recent one by Cabrales, Calvó-Armengol, and Zenou
(2009), henceforth CCZ, which inspired this work. That paper also studies agents who
choose how much to socialize, spreading their efforts uniformly within a group. The main
difference is that in the CCZ model, these choices yield links of intermediate strength, so that
if $i$ mingles with intensity $x_i$ and $j$ mingles with intensity $x_j$, then there is a link between them
of strength $x_i x_j$ formed with certainty. In our model, there would be a link between them
formed with probability $p(x_i, x_j)$, where $p$ can be a general symmetric function satisfying very
mild conditions; the presence of a link is discrete, so that it is either present or absent. This
yields stochastic networks with richer structure, at the expense of more complex probabilistic 
calculations. The modeling differences arise from a difference in motivation: our focus is on 
agents who socialize for the purpose of forming future long-lived connections that may or 
may not be realized, while in the CCZ framework, the emphasis is on the spillovers that 
current socializing creates for current production. The models also yield different insights. 
Due to the smoothness of the interaction function, the CCZ approach permits a full-fledged 
equilibrium/welfare analysis of socializing and production. In contrast, our approach takes a 
reduced-form view of network benefits but allows us to look at the surprising tipping points 
in mingling-based network formation and to model random networks with rational agents. 



2 The Environment 

Basics. There are $n > 3$ agents who want to form relationships (or "links") between them, and they
start out unlinked. Agent $i \in \{1, \ldots, n\}$ interacts with the other agents according to an
overall intensity of interaction $x_i \in [0,1]$, which is her choice variable. The quantity of
interaction between agents $i$ and $j$ is given by $p_{ij} = p_{ji} = p(x_i, x_j)$, and a link will form
in the future between agents $i$ and $j$ with probability equal to the quantity of interaction
$p_{ij}$, independently across links. We assume that $p : [0,1] \times [0,1] \to [0,1]$ is a continuous
symmetric function, which is strictly increasing on $(0,1] \times (0,1]$. Finally, we assume that
$p(0,0) = 0$ and $p(1,1) = 1$, so that if both sides of a future relationship put no effort into
it, then it has no chance of materializing, and if both are fully committed to it then it will
succeed for certain.
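
As an illustration of the link-formation stage just described, the following is a minimal sketch that draws one realized network from a profile of intensities. It assumes the functional form $p(x_i, x_j) = x_i x_j$, which is only one function satisfying the stated conditions (continuous, symmetric, strictly increasing on $(0,1] \times (0,1]$, with $p(0,0) = 0$ and $p(1,1) = 1$); the model allows any such $p$. The function names and the intensity values are hypothetical.

```python
import random

def link_probability(x_i, x_j):
    # Assumed illustrative form p(x_i, x_j) = x_i * x_j; the model only
    # requires a continuous, symmetric, increasing p with p(0,0)=0, p(1,1)=1.
    return x_i * x_j

def sample_network(x, rng):
    """Draw a symmetric 0/1 matrix G in which link {i, j} forms
    independently with probability p(x_i, x_j)."""
    n = len(x)
    G = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < link_probability(x[i], x[j]):
                G[i][j] = G[j][i] = 1
    return G

rng = random.Random(0)               # fixed seed so the draw is reproducible
x = [0.0, 0.3, 0.7, 1.0, 1.0]        # hypothetical intensities x_i in [0, 1]
G = sample_network(x, rng)
```

Under this assumed form, an agent who chooses intensity 0 forms no links, and two agents who both choose intensity 1 are linked with certainty, matching the boundary conditions on $p$.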

Timing. In the first stage, agents interact by setting their choice variables $x_i$; they pay the
interaction costs up front. At the second stage the social network is realized; we denote it
by an n-by-n symmetric matrix $G$. The indicator variable of the presence of the link $\{i,j\}$
is written $G_{ij} = G_{ji} \in \{0, 1\}$. Agents get benefits from having direct and indirect links in
the realized network.
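
To make the second-stage benefits concrete, here is a minimal sketch that counts an agent's friends and friends of friends in a realized matrix $G$ and weights them by $v_1$ and $v_2$. It follows the convention used in the Preferences paragraph: a friend of a friend is someone who is not a direct friend but shares at least one mutual friend, and each such agent is counted once. The function names, the example network, and the numerical values of $v_1$ and $v_2$ are hypothetical, and interaction costs are deliberately left out.

```python
def friends_and_fof(G, i):
    """Return (# friends, # friends of friends) of agent i in the 0/1 matrix G."""
    n = len(G)
    friends = {j for j in range(n) if j != i and G[i][j] == 1}
    fof = set()
    for k in friends:
        for j in range(n):
            # j counts only if it is neither i herself nor a direct friend of i.
            if j != i and j not in friends and G[k][j] == 1:
                fof.add(j)
    return len(friends), len(fof)

def realized_benefit(G, i, v1, v2):
    """Gross benefit v1 * (# friends) + v2 * (# friends of friends)."""
    n_friends, n_fof = friends_and_fof(G, i)
    return v1 * n_friends + v2 * n_fof

# Hypothetical network with links {0,1}, {0,2}, {1,2}, {2,3}.
G = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
counts_agent_0 = friends_and_fof(G, 0)           # friends 1 and 2; agent 3 via 2
benefit_agent_0 = realized_benefit(G, 0, v1=1.0, v2=0.5)
```

Agents 1 and 2 are both friends of agent 0 and of each other, so neither is counted again as a friend of a friend.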

Preferences. Agent $i$'s costs are convex in her total quantity of interaction, and explicitly
take the form:



where $c_i$ is an agent-specific coefficient capturing the cost of social interaction, which is the
private information of agent $i$. It will be useful to refer to the inverse $c_i^{-1}$ as the sociability of
agent $i$, and denote it by $s_i$. Agent $i$ gains a value $v_1$ from any friend (a $j$ such that $G_{ij} = 1$)
and a value $v_2$ from each friend of a friend: an agent to whom she is not directly linked,
but connected through at least one mutual friend.³ Formally, this is any $j$ such that $G_{ij} = 0$
but $G_{ik}G_{kj} = 1$ for some $k$.⁴ We assume that $v_1 > v_2 > 0$. Thus, the expected utility of
agent $i$ can be written as

$$u_i(x) = E\left[ v_1 \cdot \#\,\text{friends} + v_2 \cdot \#\,\text{friends of friends} \right]$$

3 An agent $j$ who is a friend of $i$ is not called a friend of friend of $i$ even if there is a third agent $k$ who is
linked to both $i$ and $j$. In terms of the utility function, this means that one does not get additional value
from people one knows directly if one is also connected to them indirectly.

4 One could also model agents as valuing contacts which are removed from them by two or more links in
the realized network ($v_3, v_4, \ldots$). However, we do not view these as realistically having first-order effects on
agents' considerations in network formation.

Comments on the Modeling Assumptions. This is a model of social network formation
among a large group of people, all meeting each other for the first time. People gain benefits
from having friends and friends of friends in the future, yet have to spend costly time to form
new relationships with others. A leading example is that of students coming into an MBA
program. Some of the main benefits of pursuing an MBA are the social and professional
connections that can be gained while getting the degree. However, students have only a
limited amount of time during the program, and socializing displaces other valuable activities.
Other suitable examples are provided by entering students in other academic programs, new
recruits in the military, and businesspeople at a conference or trade show. As mentioned in
the introduction, such situations display properties which are captured in the assumptions of
our model, yet are very different from the ones underlying network-stability models, which 
take some existing network as a point from which to consider deviations. We now explain 
our assumptions in more detail in the context of the motivating examples. 

First, the process of forming new relationships exhibits a substantial amount of
fundamental uncertainty, in contrast to the maintenance of existing relationships. When two
people have known each other for a long time and are willing to make the investment
required to continue the relationship, it is most likely that the link between them will prevail.
In contrast, we view the process of forming new relationships as fundamentally uncertain.
New acquaintances may not maintain a relationship for many reasons: they might not share
enough common interests; they may move to different locales before the relationship is
established; they may realize they do not like each other; or they might simply lose touch because
of exogenous distractions. Thus, the model features probabilistic network formation. When
investing effort in socializing, agents can affect the probability of a relationship forming 
but cannot guarantee it unless they both invest the maximum possible amount. Otherwise, 
our model allows for a great deal of generality in how decisions translate into relationship 
probabilities. 

Second, when investing in the formation of a link with another agent, the major cost
involved is the time it takes to get to know him or her and establish a basis for a potential
relationship. A substantial portion of this time is spent in the beginning stages of the
relationship, while benefits from the relationship may be reaped over many years in the
future. Thus, costs in our model are paid up front, before the structure of the network is
determined, while utility from the social network is accrued at the second stage, after links
are realized. Accordingly, we initially assume away costs of maintaining links after they have
formed. In Section 5 we introduce maintenance costs into our model and show that they
do not affect our main results.



Third, like Cabrales, Calvó-Armengol, and Zenou (2009), we assume that agents cannot target
specific individuals when socializing; instead, they interact generally within the group. In the
MBA example, this would be equivalent to students deciding on how many parties to go to or
MBA clubs to join. In Section 5.1 we relax this assumption and show that for reasonably large
populations and in the presence of some decreasing marginal returns to socializing, this
assumption is in fact a result obtained in equilibrium.

Fourth, we make the standard assumption that friends of friends are valuable. This is 
consistent with previous models. However, we mention it because it is a key driving force 
in the analysis. At first we will work under the assumption that this benefit from having
friends of friends is an exogenous parameter, and will later endogenize it.

Fifth, we assume that agents' time costs are a function of the sum of their probabilities
of forming relationships. The two essential substantive assumptions behind this are the 
"independence" of the interactions across pairs of agents, and the ex-ante symmetry of all 
pairs in terms of how costly interaction is. 

3 Homogeneous Agents 

We now turn to analyzing the model, first focusing on equilibrium behavior of the strategic
agents, and then on efficiency. In this section we assume that all agents are homogeneous,
in the sense that they all have the same cost of social interaction time, c; in the next
section this assumption will be relaxed. All proofs are provided in the appendix.

3.1 Equilibrium 

The solution concept we use is a symmetric equilibrium in which all agents choose the same 
intensity x. Since we are modeling a process of network formation, where agents start out 
without any prior information about each other, or regarding some target or benchmark 
social network, any equilibrium concept which is asymmetric implicitly assumes some
mechanism for coordination. In the absence of such a mechanism, a symmetric equilibrium is a
particularly compelling concept in our setting.

In this game there always exists a trivial symmetric equilibrium: the one in which all
agents socialize with intensity 0 with everyone else. Our first result establishes the existence
and uniqueness of a nontrivial symmetric equilibrium, in which every agent plays her unique 
best response. 

Theorem 1. A symmetric equilibrium with positive linking probabilities always exists, and is 
unique. This is a strict equilibrium, in the sense that each agent has a unique best response. 

We will denote the probability of a relationship forming between any two agents in a 
symmetric equilibrium by p*. While we can only characterize p* implicitly, this allows us to 
provide comparative statics with respect to the costs and benefits from direct and indirect 
friends. 

Proposition 1. Let p* be the probability of forming relationships in the symmetric
equilibrium. If p* < 1, it holds that:

1. $\partial p^* / \partial c < 0$;

2. $\partial p^* / \partial v_1 > 0$;

3. For a fixed $n$, there exists a threshold probability $\bar{p}(n) > 0$ such that if $p^* < \bar{p}$ then
$\partial p^* / \partial v_2 > 0$, and if $p^* > \bar{p}$ then $\partial p^* / \partial v_2 < 0$.

The first two comparative statics we present are relatively straightforward. With rising 
costs agents find it more costly to interact and thus reduce their equilibrium intensity of
interaction. Similarly, increasing the benefits from direct friends directly increases the personal
benefits from interaction, and hence the equilibrium probability of forming relationships rises. 

The comparative statics with respect to the value of friends of friends are more subtle. 
While an increase in the value of friends of friends does render interaction more valuable, 
it also promotes a free-riding incentive, since an agent does not pay directly for her friends 
of friends. Holding the other parameters fixed, there is a critical level of p* such that for
equilibrium probabilities above it the free-riding effect is stronger, and for probabilities
beneath the threshold the direct effect triumphs. The intuition is the following: with very
high equilibrium linking probabilities agents form relatively many links, and thus expect to
have relatively many friends of friends, with frequent overlap: an agent will tend to know
friends of friends through multiple direct friends. When $v_2$ rises, the free-rider incentive
to reduce social interaction intensity and cut costs is thus first order and trumps the
increased value of more interaction. When p* is very low, the network is sparse and agents
expect few friends of friends, and thus the dominating effect is the direct one.
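
The two opposing effects can be illustrated numerically. The sketch below is a stylized calculation rather than the paper's proof: it assumes that every other pair of agents is linked with probability $p$, that agent $i$'s own links each form with probability $q$, and that links are independent. Under these assumptions the probability that a given agent $j$ is a friend of a friend of $i$ is $(1-q)\left(1 - (1 - qp)^{n-2}\right)$, and its derivative with respect to $q$ at $q = p$ is the $v_2$-weighted part of the marginal return to socializing. The function names and the chosen values of $n$ and $p$ are hypothetical.

```python
def fof_probability(q, p, n):
    """P(j is a friend of a friend of i): no direct link (prob. 1 - q) and at
    least one of the n - 2 possible two-step paths is present (each w.p. q * p)."""
    return (1.0 - q) * (1.0 - (1.0 - q * p) ** (n - 2))

def fof_marginal(p, n):
    """Derivative of fof_probability with respect to q, evaluated at q = p."""
    no_path = (1.0 - p * p) ** (n - 2)
    free_riding = -(1.0 - no_path)   # a direct link displaces an indirect one
    direct = (1.0 - p) * (n - 2) * p * (1.0 - p * p) ** (n - 3)  # new two-step paths
    return free_riding + direct

n = 400
sparse = fof_marginal(0.01, n)   # low linking probability
dense = fof_marginal(0.5, n)     # high linking probability
```

In this sketch the marginal term is positive in the sparse case and negative in the dense case, which is consistent with the sign change at a threshold probability described above.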



3.2 Asymptotic Results 

Since we are modeling large social networks of hundreds or thousands of agents, it is
instructive to analyze the asymptotic behavior of the equilibrium when n grows large. Our
main result shows that this behavior can take on two very different forms. To state it
formally we define $F_i$ to be the expected number of friends for agent $i$ in the unique symmetric
equilibrium, where $F_i = (n - 1)p^*$.

Theorem 2. The symmetric equilibrium of the network formation game is asymptotically governed by
two possible regimes:

1. If $v_2 < \tau_{eq} = c$, then the equilibrium linking probability decays at a rate of $1/n$ and the
expected number of friends each agent has in equilibrium converges to a finite number:

$$\lim_{n \to \infty} F_i = \frac{v_1}{c - v_2}.$$

2. If $v_2 > \tau_{eq}$, then the equilibrium linking probability decays at a rate of $n^{-1/2}$ and the
expected number of friends each agent has in equilibrium grows to infinity at a rate of
$n^{1/2}$:

$$\lim_{n \to \infty} n^{-1/2} F_i = \log^{1/2}\left(\frac{v_2}{c}\right).$$

Theorem 2 shows that equilibrium behavior, asymptotically, can fall into two starkly
different regimes. The governing regime depends on the comparison between $v_2$, the value
of friends of friends, and the cost coefficient $c$, such that when $v_2$ crosses over the threshold
$\tau_{eq} = c$, a phase transition occurs. In the high-intensity regime, which governs when $v_2$ is
high relative to the cost coefficient $c$, agents invest long periods of time interacting with
one another, and the expected equilibrium number of friends grows with the size of the
population, $n$, at a scale equal to the square root of $n$. Intuitively, this expected number
of direct friends each agent makes in equilibrium, or expected degree, is increasing in the
value of indirect friends, and decreasing in costs. Surprisingly, the asymptotic degree does not
depend at all on the value of direct friends. This implies that in the high-intensity regime
the dominant factor compelling agents to interact is the benefit of indirect friends. Since the
number of friends of friends is, approximately, the square of the number of friends, in the
limit this benefit completely trumps the direct benefit of friends.

In contrast, in the low-intensity regime, when $v_2$ is low relative to $c$, agents interact at
an intensity which is an order of magnitude less than in the high-intensity regime. The
expected number of friends each agent makes in equilibrium now tends to a constant as the
population grows large. This constant is an increasing function of the value of both friends
and friends of friends, but while it is a linear function of $v_1$, it is a non-linear function of
$v_2$, which explodes when $v_2$ draws close to the threshold $\tau_{eq}$ from below. Intuitively, the
asymptotic expected number of friends is a decreasing function of the cost coefficient. In
contrast with the high-intensity regime, the value of direct friends has a first-order effect on
the equilibrium intensities since in the low-intensity regime the number of friends of friends
does not completely overpower the number of direct friends.
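
The two limits in Theorem 2, $v_1/(c - v_2)$ in the low-intensity regime and $n^{1/2}\log^{1/2}(v_2/c)$ in the high-intensity regime, can be checked numerically in a stylized version of the homogeneous model. The sketch below rests on assumptions that are not taken from the text: the cost is taken to be $(c/2)\left(\sum_j p_{ij}\right)^2$, one convex specification consistent with the stated limits; each agent is treated as choosing her own linking probability directly; and the symmetric equilibrium is found by bisection on the first-order condition. The function names and parameter values are hypothetical.

```python
import math

def foc_gap(p, n, v1, v2, c):
    """Marginal benefit minus marginal cost (per potential partner) of raising
    one's own linking probability when everyone links with probability p."""
    no_path = (1.0 - p * p) ** (n - 2)
    marginal_benefit = (
        v1
        - v2 * (1.0 - no_path)
        + v2 * (1.0 - p) * (n - 2) * p * (1.0 - p * p) ** (n - 3)
    )
    marginal_cost = c * (n - 1) * p      # from the assumed cost (c/2) * ((n-1) p)^2
    return marginal_benefit - marginal_cost

def symmetric_equilibrium(n, v1, v2, c, iters=200):
    """Bisection for the interior root of foc_gap on (0, 1). The gap equals
    v1 > 0 at p = 0 and is assumed negative at p = 1 (true for large n)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if foc_gap(mid, n, v1, v2, c) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expected_friends(n, v1, v2, c):
    return (n - 1) * symmetric_equilibrium(n, v1, v2, c)

n = 10**6
low = expected_friends(n, v1=1.0, v2=0.5, c=1.0)    # v2 < c: low-intensity regime
high = expected_friends(n, v1=3.0, v2=2.0, c=1.0)   # v2 > c: high-intensity regime
low_limit = 1.0 / (1.0 - 0.5)                       # v1 / (c - v2)
high_limit = math.sqrt(n * math.log(2.0 / 1.0))     # sqrt(n * log(v2 / c))
```

With these hypothetical parameters, `low` stays close to a constant while `high` is of order $\sqrt{n}$, so a small change in $v_2$ around $c$ moves the expected degree by orders of magnitude.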

The qualitative difference between the local network behavior in the two regimes results
in dramatic differences in overall features of the entire network. To describe these differences
we shall define the following terms: we say that a network $G$ is connected, if for any two
agents $i, j$ there exists a sequence of agents $i_1, \ldots, i_n$ linking them, such that $G_{i i_1} = 1$,
$G_{i_n j} = 1$, and for every $1 \le k \le n - 1$, $G_{i_k i_{k+1}} = 1$; we say that agents $i, j$ are at distance
$k$ in a network $G$ if the shortest path connecting them in $G$ is of length $k$; finally, given
a network $G$, we define the diameter of $G$ to be the maximum distance between any two
agents in $G$.
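
These definitions translate directly into a breadth-first search. The following is a minimal sketch for a network stored as a symmetric 0/1 matrix; the function names and the small example networks are hypothetical.

```python
from collections import deque

def distances_from(G, source):
    """Shortest-path distance from source to every agent (None if unreachable)."""
    n = len(G)
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in range(n):
            if G[u][w] == 1 and dist[w] is None:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def is_connected(G):
    """True if every agent can be reached from agent 0."""
    return all(d is not None for d in distances_from(G, 0))

def diameter(G):
    """Maximum distance between any two agents; None if G is not connected."""
    best = 0
    for s in range(len(G)):
        dist = distances_from(G, s)
        if any(d is None for d in dist):
            return None
        best = max(best, max(dist))
    return best

# Hypothetical examples: a path 0-1-2-3, and the same network without link {2,3}.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
broken = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
```

In the path example the two end agents are at distance 3, which is also the diameter; removing the last link disconnects the network, so no diameter is reported.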

Using classical results from the theory of random networks, we characterize the
network-level differences between the two regimes in the following result. We say a statement holds
"asymptotically almost surely" (a.a.s.) if it holds with a probability that tends to 1 as n
grows.

Proposition 2. In the high-intensity regime the realized social network is connected
asymptotically almost surely, and the diameter of the network converges in probability to 3.

In the low-intensity regime the realized network is a.a.s. not connected. If $\frac{v_1}{c - v_2} > 1$ the
network will a.a.s. contain a single connected component which includes a positive fraction
of the agents, and if $\frac{v_1}{c - v_2} < 1$ the network will a.a.s. not contain any component larger than
$O(\log(n))$ agents.

This result shows that the difference between high-intensity and low-intensity regimes
yields sharp empirical predictions at the macro-network level. High-intensity regime networks
are connected with a very high probability, so that every agent is linked, directly or
indirectly, to any other agent. Moreover, with a probability that tends to 1, any two agents
are at most three friends away from each other, and there exists a pair of agents that are 
exactly three friends separated from each other. 

Low-intensity networks on the other hand tend to be disconnected. Depending on the 
size of the average degree relative to 1, these networks may display a giant component, 
which connects a positive fraction of the agents to each other. If such a component does 
not exist, the social network tends to display connected components which are tiny when 
compared to the overall size of the network. 
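
The contrast between the regimes can be seen in a small simulation. The sketch below draws networks in which every link forms independently with the same probability and reports the sizes of the connected components. The three expected degrees used (16, 2, and 0.5, with n = 400) are hypothetical values chosen to represent, respectively, a degree of order $\sqrt{n}$, a constant degree above 1, and a constant degree below 1; they are not computed from the model's equilibrium conditions, and the function names are likewise illustrative.

```python
import random

def sample_uniform_network(n, p, rng):
    """Adjacency lists of a network in which each link forms independently w.p. p."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def component_sizes(adj):
    """Sizes of the connected components, largest first (depth-first search)."""
    n = len(adj)
    seen = [False] * n
    sizes = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack, size = [s], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)

rng = random.Random(1)
n = 400
dense = component_sizes(sample_uniform_network(n, 16.0 / n, rng))  # degree ~ sqrt(n)
giant = component_sizes(sample_uniform_network(n, 2.0 / n, rng))   # constant degree > 1
tiny = component_sizes(sample_uniform_network(n, 0.5 / n, rng))    # constant degree < 1
```

Comparing `dense[0]`, `giant[0]`, and `tiny[0]` shows the pattern described in Proposition 2: a single component covering everyone, a giant component alongside many small pieces, and only small pieces.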

Perhaps the most striking feature of the two-regime structure is that the transition
between the two regimes does not depend on the value of direct friends $v_1$, but only on the
cost coefficient $c$ and the value of friends of friends, $v_2$. This implies that moving from one
regime to the other depends on the burden of personal costs on the one hand, but on the
other only on those benefits that come from the network structure itself which the agent
does not control: the indirect benefit of friends of friends.

Moreover, this characterization of the threshold implies that for some networks, small
changes in either the costs of social interaction or in the value of friends of friends can
cause dramatic differences in the resulting social networks. In light of Proposition 2, these
differences will manifest both at the individual level, as the average agent will have many
more friends, and at the macro-network level. As we will show below, this shift is also
associated with a sharp rise in efficiency.
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3.3 Social Networking Technology 

This result can shed some light on the recent developments in social networking technologies, 
and specifically the massive rise and arguably substantial impact of online social networks 
such as Facebook, MySpace, Linkedln and Twitter. Hundreds of millions of people now 
use these networks regularly, spending, on average, long periods of time using them daily 
(Boyd and Ellison, ( |2007[ ); "Facebook: Statistics" ( |Anon4[20lo| ). While these networks offer 
their user different and perhaps easier forms of connecting with friends, the direct benefits 
of using them (to browse photographs, exchange messages, etc.) are arguably similar to 
other technologies whose impact has been less dramatic. It is clear though, that these 
networks specifically and intentionally increase users' benefits from indirect friends. All of 
the above networks expose a user to the identity of friends of friends, usually providing some 
information about them, such as occupations, photos, hobbies and interests. Moreover, some 
of these tools, like LinkedIn, explicitly emphasize the value they add to friends of friends
by showing users how they can connect to certain individuals or organizations through their 
personal and professional social network. In light of this, our model suggests one answer to 
the question of the real value underlying the success of online social networks — by increasing 
the benefits of indirect friends in real social networks, while also arguably reducing the costs 
of interaction, they may be pushing the formation of social networks beyond the critical 
threshold and into the high-intensity regime. 



3.4 Efficient Symmetric Socializing 

Returning to our model, we can ask within the same framework how far the symmetric
equilibrium network is from efficiency, that is, from a network in which a social planner chooses
for each agent an optimal level of interaction intensity. Intuitively, our model is characterized
by only one externality: strategic agents do not internalize the benefits that their acquiring more
friends confers on their other friends through the value of friends of friends. Thus,
we would expect the equilibrium intensity levels to be inefficiently low. Our next result
confirms this, but also sheds light on the size of this inefficiency.

Theorem 3. The socially optimal level of linking probabilities is strictly higher than the
equilibrium one. Efficient interaction is governed by two regimes, separated by a threshold
$\tau_{eff} = \tau_{eq}/2$:

1. If $v_2 < \tau_{eff}$, then the efficient linking probability, $p$, decays at a rate of $1/n$ and the
efficient expected number of friends each agent has converges to a finite number.

2. If $v_2 > \tau_{eff}$, then the efficient linking probability decays at a rate of $n^{-1/2}$ and the efficient
expected number of friends each agent has grows to infinity at a rate of $\sqrt{n}$.
Under this regime, agents' welfare tends to infinity.

Figure 2: The equilibrium and efficient amounts of socializing (reflected in numbers of
friends) as $v_2$, the value of friends of friends, is varied for $n = 8000$ agents. The threshold
at which the efficient levels of socializing transition into the high regime ($\tau_{eff} = 0.25$) is
half the threshold at which the equilibrium levels do ($\tau_{eq} = 0.5$). Between these thresholds,
the resulting network is very inefficient. (The cost parameter is $c = 0.5$, and the value of
friends is $v_1 = 1$.)

We thus have that efficient intensities behave similarly to equilibrium levels and fall into
one of two regimes according to the ratio between $v_2$ and $c$, once again independently of
the value $v_1$. However, as expected, equilibrium linking probabilities are lower than efficient ones.
Perhaps surprisingly, these inefficiencies do not disappear in the limit. For very low $v_2$, such
that $v_2 < \tau_{eff}$, and for high $v_2$, such that $v_2 > \tau_{eq}$, the asymptotic equilibrium probability is
a fixed fraction of the asymptotic efficient probability. The same holds for agents' welfare.
However, for intermediate values of $v_2$, such that $\tau_{eff} < v_2 < \tau_{eq}$, the inefficiency becomes
extreme, and efficient probabilities and welfare are infinitely larger than equilibrium ones:
in equilibrium each agent has only a finite number of friends, while the efficient number of
friends goes to infinity, along with welfare. In the social networking technology context, this
implies that the networks for which small changes in $v_2$ can promote a phase transition in
interaction are exactly those networks suffering the most in terms of efficiency. Thus, the
importance of social networking technologies is further emphasized. This phenomenon is
illustrated for a particular example in Figure 2.
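
The gap between equilibrium and efficient socializing can be explored numerically by solving the two first-order conditions from the appendix: the equilibrium condition (1) and the planner's condition from the proof of Theorem 3. The following is a minimal sketch under those formulas; the function names `foc_rhs` and `expected_friends`, the bisection solver, and the grid of $v_2$ values are illustrative assumptions, not the authors' code. Bisection relies on the right-hand side being decreasing in $p$, which holds when $v_1 \ge v_2$.

```python
import math

def foc_rhs(p, n, v1, v2, planner=False):
    """Right-hand side of the first-order condition c = RHS(p).

    The equilibrium FOC uses the factor 1 + (n-1)p; the planner's FOC
    uses 1 + (2n-3)p instead. Everything else is identical.
    """
    k = 2 * n - 3 if planner else n - 1
    # log of (1-p)^(n-2) * (1+p)^(n-3), kept in logs for numerical stability
    log_term = (n - 2) * math.log1p(-p) + (n - 3) * math.log1p(p)
    return ((v1 - v2) + v2 * math.exp(log_term) * (1 + k * p)) / ((n - 1) * p)

def expected_friends(n, c, v1, v2, planner=False, iters=200):
    """Solve c = RHS(p) by bisection and return (n-1) * p."""
    lo, hi = 1e-12, 1 - 1e-12
    if foc_rhs(hi, n, v1, v2, planner) > c:
        # The FOC never binds: corner solution with p = 1.
        return float(n - 1)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if foc_rhs(mid, n, v1, v2, planner) > c:
            lo = mid   # marginal benefit still exceeds marginal cost
        else:
            hi = mid
    return (n - 1) * 0.5 * (lo + hi)

if __name__ == "__main__":
    # Parameters taken from the caption of Figure 2.
    n, c, v1 = 8000, 0.5, 1.0
    for v2 in (0.1, 0.2, 0.3, 0.4, 0.6):
        eq = expected_friends(n, c, v1, v2)
        eff = expected_friends(n, c, v1, v2, planner=True)
        print(f"v2={v2:.1f}  equilibrium={eq:8.2f}  efficient={eff:8.2f}")
```

For $v_2$ between the two thresholds (for example $v_2 = 0.4$ with $c = 0.5$), the equilibrium number of friends stays small while the efficient number is several times larger, which is the intermediate region described above.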

4 Heterogeneous Agents 

We now extend the model to allow for unobserved heterogeneity between agents. Specifically,
we assume that agent $i$'s cost coefficient $c_i$ is drawn independently according to a probability
vector $p$ over a finite set of possible costs $C = \{c_1, \ldots, c_m\}$, and that the realization of this
draw is agent $i$'s private information. We assume that the pair $(p, C)$ is commonly known.
The different cost coefficients may represent two differences between the agents: first, agents
may differ in how easy it is for them to interact and spend time with others; second, agents
may have different marginal costs of time due to different alternative uses they have for that
time. We will denote by $E[C]$ the expectation under $p$ of the cost coefficient, and similarly
by $E[S] = E[1/C]$ the expectation of sociability.

For simplicity, we assume in this section a specific functional form for the function linking 
interaction intensities and link realization probabilities: p(x, y) = xy. This functional form 
satisfies all our original assumptions on the function p. Although we make this specific choice 
for p, the results of this section generalize to a wide class of functions. 

In this context, we continue to use a symmetric equilibrium as the solution concept, where 
any two agents with the same realized costs use the same intensities. We first establish that a 
symmetric equilibrium exists in this extended context, and that it is still a strict equilibrium. 

Theorem 4. There exists a symmetric equilibrium with strictly positive interaction intensities.
In this equilibrium agents use their unique best responses.



We next turn to characterize the asymptotic behavior of equilibrium relationship realiza- 
tion probabilities. Perhaps surprisingly, the increased complexity of the environment does 
not change the qualitative nature of the asymptotic convergence when compared with the 
homogeneous case. However, the threshold separating the two regimes takes on a more 
complicated form, which depends on the distribution of costs. 

Theorem 5. The symmetric equilibrium with heterogeneous, private costs is asymptotically
unique, and governed by two possible regimes which are separated by a threshold
$\tau_{eq} = E[S]/E[S^2]$:

1. If $v_2 < \tau_{eq}$, then the equilibrium linking probabilities decay at a rate of $1/n$ and the
expected number of friends each agent has in equilibrium converges to a finite number,
so that for agent $i$ with private cost $c_i$:

$$\lim_{n \to \infty} E[\#\text{friends of } i] = \frac{1}{c_i} \cdot \frac{v_1 E[S]}{E[S] - v_2 E[S^2]}$$

2. If $v_2 > \tau_{eq}$, then the equilibrium linking probabilities decay at a rate of $n^{-1/2}$ and the
expected number of friends each agent has in equilibrium grows to infinity at a rate of
$\sqrt{n}$, and is independent of $v_1$.

In analogy with Theorem 2, we have that with heterogeneous agents holding private
information, the symmetric equilibrium can behave according to two regimes. In the high-intensity
regime, which governs when $v_2$ is higher than the threshold $\tau_{eq}$, the expected
equilibrium number of friends agents make, regardless of their private costs, grows to
infinity at a rate of $\sqrt{n}$. The exact rate of convergence is characterized implicitly in the
proof of Theorem 5, and is independent of $v_1$. Following the same arguments as in the
homogeneous case, the resulting network will be connected, and the diameter of the
network will be three, asymptotically almost surely. Perhaps surprisingly, these results hold
regardless of the exact nature of the distribution of costs in the population.

In the low-intensity regime, which governs when $v_2$ is lower than the threshold $\tau_{eq}$,
the expected equilibrium number of friends converges, as $n$ grows large, to a fixed number,
which is proportional to the agent's sociability. The asymptotic expected number of friends
is increasing linearly in $v_1$ in this regime, and increasing non-linearly in $v_2$, exploding when
$v_2$ approaches the threshold $\tau_{eq}$ from below. As in the homogeneous case, the resulting network
will be disconnected asymptotically almost surely, and the existence of the giant component
will depend on the comparison between the expected equilibrium number of friends and 1.

The threshold level itself, $\tau_{eq}$, is the ratio between the first two moments of the distribution
of sociabilities. Rewriting this threshold as:

$$\tau_{eq} = \frac{E[S]}{E[S]^2 + \mathrm{Var}[S]}$$

yields the following comparative statics.

Corollary 1. $\tau_{eq}$ decreases with mean-preserving spreads of the distribution of sociabilities.
A variance-preserving increase in the mean of the distribution of sociabilities will increase $\tau_{eq}$
when the mean is low compared with the standard deviation, and will decrease it otherwise.

The first comparative static suggests that the presence of some agents with higher sociabilities,
or lower cost coefficients, even at the expense of agents with lower sociabilities,
can be crucial for a forming group to reach the beneficial high-intensity regime. This implies
that highly sociable individuals, who in equilibrium obtain many friends and thereby
provide their friends with many indirect links, may be the key to a highly connected group.

The second comparative static shows, perhaps surprisingly, that increasing the sociabil- 
ity of all agents in society by the same amount can be detrimental to transition into the 
high-intensity regime, and this is when the mean sociability is high relative to its standard 
deviation. The intuition for this counter-intuitive result is free-riding — when all agents' 
sociability increases, agents with low sociabilities receive a higher increase in percentage 
terms. Agents with higher sociabilities, who are not that far apart because of the relatively 
low variance, will thus have a greater incentive to free-ride, and this might trump the direct 
effect of lowered costs. 
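
Both comparative statics in Corollary 1 can be checked on small discrete distributions. The sketch below computes $\tau_{eq} = E[S]/E[S^2]$ directly; the helper name `tau_eq` and the example distributions are illustrative assumptions chosen only to show the direction of each effect.

```python
def tau_eq(sociabilities, probs):
    """Threshold E[S] / E[S^2] for a finite distribution of sociabilities S = 1/C."""
    mean = sum(q * s for s, q in zip(sociabilities, probs))
    second_moment = sum(q * s * s for s, q in zip(sociabilities, probs))
    return mean / second_moment

# Homogeneous benchmark: every agent has c = 0.5, so S = 2 and the threshold equals c.
homogeneous = tau_eq([2.0], [1.0])

# Mean-preserving spread of S around the same mean of 2: the threshold falls.
spread = tau_eq([1.0, 3.0], [0.5, 0.5])

# Mean low relative to the standard deviation (mean 0.5, sd 1.2):
# shifting every sociability up by 0.2 raises the threshold.
low_mean = tau_eq([0.1, 4.1], [0.9, 0.1])
low_mean_shifted = tau_eq([0.3, 4.3], [0.9, 0.1])

# Mean high relative to the standard deviation (mean 3, sd 1):
# the same shift lowers the threshold.
high_mean = tau_eq([2.0, 4.0], [0.5, 0.5])
high_mean_shifted = tau_eq([2.2, 4.2], [0.5, 0.5])

if __name__ == "__main__":
    print(homogeneous, spread)
    print(low_mean, low_mean_shifted)
    print(high_mean, high_mean_shifted)
```

Working in sociabilities rather than costs keeps the moments in the same units as the formula above; a cost distribution can be converted by taking reciprocals of each cost before calling the helper.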



5 Extensions 

5.1 Allowing Discrimination: Mingling as an Equilibrium 

In the description of our game, we assumed that agents choose one intensity for socializing 
within the group in general, without the possibility of discriminating. While this can be 
motivated as a reasonable restriction based on the difficulty of coordinating and focusing 
on specific others at the early stages of interactions, as in Cabrales, Calvo-Armengol, and
Zenou (2009), we do not have to view this as a restriction. Indeed, we can enrich the model
to one in which each agent $i$ chooses an intensity $x_{ij}$ to direct at each other agent $j$, and
the probability that they link is the quantity of interaction $p_{ij} = p(x_{ij}, x_{ji})$. The rest of the
model is unchanged.

When the baseline model is enriched in this way, mingling — devoting equal intensity to 
every other agent — is not an equilibrium. However, if we add a small and realistic pertur- 
bation to the theory — namely, by supposing that the probability of forming a friendship is 
slightly concave in the quantity of interaction — mingling becomes a strict equilibrium. The 
predictions of that nearby model — in terms of the mingling intensities and the networks 
that are formed — are close to the baseline model. In this section, we detail these points. 

First, it is useful to understand precisely why mingling is not an equilibrium. If everyone 
else were mingling, agents would prefer to deviate and focus their efforts only on some 
subset of others. The feature driving this phenomenon is the fear of overlap between the 
neighborhoods of friends. For an illustration of this, suppose that an agent i has only two 
potential friends j and k, and each of them will have 3 friends, in expectation, in addition 
to i. In expectation, there will be one agent other than i who is a friend of both j and k. 
Suppose also that $j$ and $k$ both direct intensities $x_{ji} = x_{ki} = y$ at agent $i$. If agent $i$ fixes the
sum of intensities $x_{ij} + x_{ik}$ and decides how to apportion his socializing, it is straightforward
to verify that he prefers to focus on one agent, either $j$ or $k$. This is because of the convexity
of benefits introduced by the overlap: linking to both is less than twice as good as linking
to a single one.

This overlap problem is a minor artifact rather than a major consideration because the 
amount of overlap in equilibrium will be relatively tiny. To make this precise, suppose we 
introduce a small amount of concavity in the function that maps total time spent together 
to the probability of a link forming. Equivalently, via a reparameterization, let us assume 
that the costs of interaction time are slightly convex, so that the utility function takes the 
form 

$$u_i(x) = E[v_1 \cdot \#\text{friends} + v_2 \cdot \#\text{friends of friends}] - \frac{c}{2}\Bigl(\sum_{j \neq i} p_{ij}^{\beta}\Bigr)^{2}$$

for some $\beta > 1$, where we have, of course, chosen a simple form of convexity.

In this framework, it can be shown that, for any $n$, there is a range $[\underline{\beta}, \overline{\beta}]$ such that, when
$\beta$ is in this range, mingling is a strict equilibrium, and the mingling intensity is within a
specified distance (say, 10%) of that of the baseline model. While this range will depend on 
n, our preliminary calculations show that it is reasonably large for the values of n where the 
theory would be relevant — populations of sizes between several dozen and several thousand. 
The precise calculations will be included in a later draft. 



The intuition, however, is simple. Mingling imposes only a mild constraint on the agents
because overlap in friends' neighborhoods is tiny. Adding a slight amount of decreasing
marginal returns is enough to make mingling strictly the best response to mingling.

5.2 Perturbations of the Utility Function 

While the linear-quadratic specification is standard and obviously advantageous from the 
standpoint of tractability, one might naturally wonder about whether the two regimes and 
other stark features of the model are driven by the particular parameters chosen for the 
analysis. The answer is that while the specification and the asymptotics we have focused on 
are useful tools for analysis and exposition, the actual predictions about equilibrium behavior 
for finite populations are robust to perturbations in agents' utility function. 
To explore this issue, we consider generalizing the utility function to: 



$u_i(x) = E[v_1 \cdot \#\text{friends} + v_2 \cdot \#\text{friends of friends}]$ −




where $\alpha > 1$ is a parameter. This captures the notion that costs of socializing might not
scale exactly quadratically in total quantity of interaction.⁵

If $\alpha > 2$, then arguments similar to those in the appendix show that the low-intensity
regime is the only one that survives asymptotically; our numerical calculations for finite
populations indicate that, indeed, intensities in equilibrium are low in finite populations.
When $\alpha < 2$, asymptotically the high-intensity regime is seen over the whole range of cost
and benefit parameters. However, in finite populations, we still see a reflection of the same
basic pattern identified by our asymptotic results with $\alpha = 2$. When the value of friends
of friends is very low, equilibrium intensities are low, but they rapidly shift to a high level
when $v_2$ surpasses a critical level. The main difference is that as $\alpha$ is reduced, that threshold
occurs earlier. This is illustrated in Figure 3.

The basic message is that working with $\alpha = 2$ and the asymptotic regime is a convenient
expository device. It helps us identify the two regimes in a simple way and characterize
their properties. But the numerics show that in reasonably large finite populations, less
convex cost functions give qualitatively similar equilibria. Characterizing the cutoffs between
regimes exactly as functions of $n$ for values of $\alpha$ other than 2 would be an interesting extension
of the analysis.

5 This can also be seen, via a reparameterization, as varying the concavity of benefits as opposed to the
convexity of costs.

Figure 3: Equilibrium numbers of friends when the model is perturbed and the costs of time
scale as time to the power $\alpha$. In the baseline model, $\alpha = 2$. The numerical calculations
in this network of $n = 8000$ agents show that the same basic pattern identified in the
asymptotic analysis continues to hold. As costs grow less convex, the threshold at which
the equilibrium shifts to the high-intensity regime occurs at a lower value of $v_2$. (The cost
parameter is $c = 0.5$ and the value of friends is $v_1 = 1$.)



5.3 Maintenance Costs and the Stability of the Random Networks 

For simplicity of exposition, we have focused on the up-front costs of link formation, and 
have ignored the costs of maintaining links. Such costs introduce stability considerations; it 
may be that an agent does not wish to maintain a link once it has formed, because the link 
requires more effort to maintain than the marginal benefit it returns. We say that a network 
is unilaterally stable if no agent wants to sever a link for this reason. 

It turns out that maintenance costs are simple to incorporate into the model, and there 
is a reasonable range of parameter values for which the networks formed in the equilibrium 
of the formation game are stable asymptotically almost surely. To be precise, suppose that, 
to maintain a link and receive the benefits, an agent must pay a cost of $c$ in addition to the
up-front costs. Here costs and benefits are interpreted as flows to be received in the future.
We say a network is not unilaterally stable if there is some agent $i$ and some link incident
to $i$ so that deleting the link results in a decrease in the network-based benefits to $i$ (from
friends and friends of friends) of less than $c$.

A link to agent $j$ confers a marginal benefit of at least $v_1 - v_2$ upon agent $i$. This is
because there may be some $k$ who is linked to both $i$ and $j$, so that if $i$ severs the link to $j$,
she will still receive a benefit $v_2$ from being indirectly connected to $j$. Thus, if $c < v_1 - v_2$, any
network that forms will be unilaterally stable. The following proposition says that, in the
low-intensity regime, this condition is not only sufficient, but also necessary, for the formed
network to be unilaterally stable asymptotically almost surely. That is, if $c > v_1 - v_2$, then
some agent will wish to sever a link with a probability bounded away from 0 even as $n$ grows
very large.
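
The stability notion can be checked mechanically on any realized network. The following is a minimal sketch, assuming the network is stored as a dictionary mapping each agent to the set of its friends; the names `network_benefit` and `is_unilaterally_stable` are hypothetical helpers, not part of the model's formal apparatus. A link counts as a violation when deleting it lowers the agent's network-based benefits by less than the maintenance cost.

```python
def network_benefit(adj, i, v1, v2):
    """v1 * (#friends of i) + v2 * (#friends of friends of i)."""
    friends = adj[i]
    friends_of_friends = set()
    for j in friends:
        friends_of_friends |= adj[j]
    # A friend of a friend is at distance exactly two: not i, not a direct friend.
    friends_of_friends -= friends
    friends_of_friends.discard(i)
    return v1 * len(friends) + v2 * len(friends_of_friends)

def is_unilaterally_stable(adj, v1, v2, c):
    """True if no agent gains by severing one of its own links."""
    for i in adj:
        base = network_benefit(adj, i, v1, v2)
        for j in list(adj[i]):
            # Temporarily delete the link i-j and measure i's loss in benefits.
            adj[i].remove(j)
            adj[j].remove(i)
            loss = base - network_benefit(adj, i, v1, v2)
            adj[i].add(j)
            adj[j].add(i)
            if loss < c:
                return False
    return True

if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    path = {0: {1}, 1: {0, 2}, 2: {1}}
    print(is_unilaterally_stable(triangle, 1.0, 0.3, 0.5))
    print(is_unilaterally_stable(triangle, 1.0, 0.3, 0.8))
    print(is_unilaterally_stable(path, 1.0, 0.3, 0.8))
```

In the triangle, severing a link turns a direct friend into a friend of a friend, so the loss is exactly $v_1 - v_2$, the lower bound discussed above. In the path, no such common friend exists and every link is worth at least $v_1$.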

Proposition 3. In the low-intensity regime, the network created by the formation game is
a.a.s. unilaterally stable if and only if $c < v_1 - v_2$.

The proposition shows that, in general, when $c > v_1 - v_2$, the endogenous severing of links
after the network is formed is a live possibility in at least one regime, which substantially
alters the agents' expected utilities at the formation stage because they have to account for
the possibility that some of their links will vanish. The analysis of the ways in which agents'
formation-stage calculations change as a result of this possibility would be an interesting
direction for further study.⁶



On the other hand, when $c < v_1 - v_2$, stability considerations don't introduce any strategic
complexity, because no links are severed after they are formed. The only issue is that agents'
expected benefits from a direct friendship change from $v_1$ to $\tilde{v}_1 = v_1 - c$, because they will
have to pay this maintenance cost.⁷ Subject to this replacement, the theory goes through
unchanged.

In short, while stability considerations can introduce substantial complexities into the
analysis, there is a range of maintenance costs in which the networks formed in equilibrium
are a.a.s. robust to the possibility of unilateral deviation, and agents' formation-stage
accounting for the maintenance costs of links is straightforward to incorporate into
the basic model.

6 In the high-intensity regime, we conjecture that the network is a.a.s. unilaterally stable for any finite
$c$. This is because each friend of $i$ provides indirect connections to a large number of agents whom $i$ would
not have access to through other friends. We also conjecture that allowing $v_1$ to be large and negative to
account for agents foreseeing the maintenance cost would not affect the analysis of that regime.

7 We make the inequality $c < v_1 - v_2$ strict in order to ensure that $\tilde{v}_1 > 0$, which is an assumption that we
have maintained throughout the analysis of the formation stage.



6 Concluding Remarks 

This model of network formation with rational agents and uncertainty in the realization of 
links has two main appealing features. First, the networks it predicts have the complex and
irregular structure seen in real networks (Newman, 2003); moreover, they correspond to the
random network models recently developed in the probability literature (Chung and Lu, 2002;
Chung et al., 2004). At the same time, the model does not rely on mechanistic foundations
for link formation; the probabilities of links are endogenous choice variables that are selected 
when agents optimize, trading off the costs of socializing against the expected benefits. From 
a technical perspective, the fact that there is uncertainty over the precise realizations of the 
links, along with convex costs of socializing, suffices to pin down equilibrium choices, in 
contrast to models of network formation where there is a multiplicity of equilibria. 

The main results of the paper serve as an illustration of the ways in which the simple 
framework can generate nontrivial predictions about how the economic fundamentals affect 
equilibrium and efficiency. In the particular application considered here, we showed that 
small changes in the value of friends-of-friends can change the orders of growth of social 
activity and the fundamental shapes of equilibrium networks. The framework is capable of 
accommodating other specifications of costs and benefits - for instance, ones that depend 
on how many mutual friends are between two agents, or on properties such as transitivity. 

While we made some first steps toward connecting rational random network formation
and the stability of networks in Section 5.3, much remains to be done in exploring which
formation processes result in persistent networks. This issue becomes more prominent when 
larger classes of deviations are admitted or when the network-based benefits take a more 
complex form, and would be an interesting direction for further research. 

Appendix



Expected Number of Friends of Friends. A calculation that will come in handy for most
of the proofs is explicitly determining the expected number of friends of friends for agent $i$, given
the intensity levels $\{x_j\}_{1 \le j \le n}$. We claim it is the following expression:

$$\sum_{k \neq i} \bigl(1 - p(x_i, x_k)\bigr)\Bigl(1 - \prod_{l \neq i,k} \bigl(1 - p(x_i, x_l)\,p(x_l, x_k)\bigr)\Bigr)$$

First, the index $k$ sums over all possible friends of friends $k \neq i$. For $k$ to be a friend of a friend,
she needs to not be a direct friend, but to be a friend of some direct friend of $i$. The first term in
the summand for agent $k$ is the probability that $i$ and $k$ are not direct friends. The second term
is the probability of the complement of $k$ not being a friend of any of $i$'s friends, which is exactly
the event that $k$ is indeed a friend of a friend of $i$. Notice that the realization of the different links
is independent across links, and thus the expectation is just this product.
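
The expression above translates directly into a double loop. The following is a minimal sketch; the function name `expected_friends_of_friends` and the convention of passing the linking function `p` as a callable are illustrative assumptions. In the symmetric case, where every pair links with probability $q$, the sum collapses to $(n-1)(1-q)\bigl(1-(1-q^2)^{n-2}\bigr)$, which gives a simple consistency check.

```python
def expected_friends_of_friends(i, x, p):
    """Expected number of friends of friends of agent i.

    x is the list of intensities, and p(a, b) is the probability that two
    agents with intensities a and b form a link.
    """
    n = len(x)
    total = 0.0
    for k in range(n):
        if k == i:
            continue
        # Probability that no agent l is linked to both i and k.
        no_common_friend = 1.0
        for l in range(n):
            if l == i or l == k:
                continue
            no_common_friend *= 1.0 - p(x[i], x[l]) * p(x[l], x[k])
        # k is not a direct friend, but shares at least one friend with i.
        total += (1.0 - p(x[i], x[k])) * (1.0 - no_common_friend)
    return total

if __name__ == "__main__":
    product_link = lambda a, b: a * b      # the form p(x, y) = xy used in Section 4
    print(expected_friends_of_friends(0, [0.3] * 6, product_link))
    print(expected_friends_of_friends(0, [0.9, 0.2, 0.5, 0.4], product_link))
```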

Proof of Theorem 1. Searching for a symmetric equilibrium with positive intensities, assume
that all agents other than agent $i$ choose $x \in (0, 1]$ as their interaction intensity. Then, agent $i$'s
optimization problem is given by:

$$\max_{x_i} \; v_1 \sum_{k \neq i} p(x_i, x) + v_2 \sum_{k \neq i} \bigl(1 - p(x_i, x)\bigr)\Bigl(1 - \prod_{l \neq i,k} \bigl(1 - p(x_i, x)\,p(x, x)\bigr)\Bigr) - \frac{c}{2}\Bigl(\sum_{k \neq i} p(x_i, x)\Bigr)^2$$

$x_i$ appears in this maximization problem only as an argument of the function $p(\cdot, x)$. Since the
function $p(\cdot, x)$ is strictly increasing in its argument, the FOC for the original problem holds if and
only if it holds for the problem where we view player $i$ as choosing $p(x_i, x)$, taking $p(x, x)$ as fixed.

Thus, the equilibrium linking probability $p(x, x) = p^*$, if it is internal, must satisfy the following
FOC:

$$\frac{d}{dp}\Bigl[ v_1 \sum_{k \neq i} p + v_2 \sum_{k \neq i} (1 - p)\Bigl(1 - \prod_{l \neq i,k} \bigl(1 - p \cdot p(x,x)\bigr)\Bigr) - \frac{c}{2}\Bigl(\sum_{k \neq i} p\Bigr)^2 \Bigr]\Bigr|_{p(x,x) = p} = 0$$

This FOC, after rearrangement, is given by:

$$\frac{(v_1 - v_2) + v_2 (1 - p)^{n-2} (1 + p)^{n-3} \bigl(1 + (n-1)p\bigr)}{(n-1)p} = c \tag{1}$$

Since $p(0, 0) = 0$ and $p(x, y)$ is continuous, the LHS as a function of $p$ tends to positive infinity at
0 (from above), and has derivative:

$$-\frac{(v_1 - v_2) + (1 - p)^{n-3} (1 + p)^{n-4} \bigl(1 + p + (3n - 7)p^2 + (n-1)(2n-5)p^3\bigr) v_2}{(n-1)p^2} < 0$$

which is strictly negative for $p \in (0, 1]$. Thus, if there exists $p^* \in (0, 1]$ which solves the FOC, it is
the unique symmetric equilibrium intensity. Otherwise, the investment level $p^* = 1$ for all agents
constitutes a unique symmetric equilibrium. ∎

Proof of Proposition 1. Implicitly differentiating $p^*$ with respect to $c$ using the FOC (1)
yields:

$$\frac{dp^*}{dc} = -\frac{(n-1)(p^*)^2}{(v_1 - v_2) + (1 - p^*)^{n-3}(1 + p^*)^{n-4}\bigl(1 + p^* + (3n-7)(p^*)^2 + (n-1)(2n-5)(p^*)^3\bigr)v_2}$$

which is strictly negative.

For the comparative static with respect to $v_1$, implicitly differentiating the FOC (1) yields:

$$\frac{dp^*}{dv_1} = \frac{p^*}{(v_1 - v_2) + (1 - p^*)^{n-3}(1 + p^*)^{n-4}\bigl(1 + p^* + (3n-7)(p^*)^2 + (n-1)(2n-5)(p^*)^3\bigr)v_2}$$

which is strictly positive.

For the comparative statics with respect to $v_2$, first we again implicitly differentiate $p^*$ with
respect to $v_2$:

$$\frac{dp^*}{dv_2} = \frac{p^*\bigl((1 - p^*)^{n-2}(1 + p^*)^{n-3}(1 + (n-1)p^*) - 1\bigr)}{(v_1 - v_2) + (1 - p^*)^{n-3}(1 + p^*)^{n-4}\bigl(1 + p^* + (3n-7)(p^*)^2 + (n-1)(2n-5)(p^*)^3\bigr)v_2}$$

Since the denominator is always positive, the sign of the entire derivative is determined by the sign
of:

$$(1 - p^*)^{n-2}(1 + p^*)^{n-3}\bigl(1 + (n-1)p^*\bigr) - 1 \tag{2}$$

This expression, as a function of $p^*$, is 0 at 0, converges to $-1$ from above at 1, and has derivative:

$$-(n-2)(1 - p^*)^{n-3}(1 + p^*)^{n-4}\bigl(2(n-1)(p^*)^2 + 3p^* - 1\bigr)$$

which is positive just right of 0, and becomes negative before one, changing sign once. Thus, (2) is
first positive and then negative, changing signs only once. This completes the proof. ∎

Proof of Theorem 2. Consider the FOC in (1), which we rewrite as:

$$c = \frac{(v_1 - v_2) + v_2 (1 - p)\,(1 - p^2)^{n-3}\,\bigl(1 + (n-1)p\bigr)}{(n-1)p} \tag{3}$$

We first claim that it must hold in the symmetric equilibrium for $n$ large enough. To see this,
by the proof of Theorem 1 we have that if the FOC does not hold in equilibrium, then the symmetric
equilibrium is the one where all linking probabilities are 1, and hence $x_j = 1$ for all $j$. Agent $i$'s marginal
utility (in $p(x_i, 1)$) at those equilibrium intensity levels, when $x_j = 1$ for all $j \neq i$, is, up to the
positive factor $(n-1)^2$:

$$\frac{v_1 - v_2}{n-1} - c$$

which becomes negative for large enough $n$. Thus, for large enough $n$ this could not be the
equilibrium, and the symmetric equilibrium must satisfy the FOC.

We now claim that $p^* \to 0$ as $n \to \infty$. Assume otherwise; then there exists a subsequence $n_k \to \infty$
with $p^*(n_k) > \epsilon$ for some $\epsilon > 0$. Along this subsequence, the denominator of the RHS of (3) goes
to infinity. The numerator of the RHS of (3), however, goes to $v_1 - v_2$. Thus, the RHS converges
along this subsequence to 0, yielding a contradiction, as the LHS is fixed at $c$ for any $n$.

Using this, we consider the following cases:

1. $\limsup_{n \to \infty} n^{1/2} p^*(n) = 0$.
This implies that

$$\lim_{n \to \infty} \bigl(1 - (p^*)^2\bigr)^{n-3} = 1$$

and thus when $n$ goes to $\infty$ the dominant term in the numerator of the RHS of (3) is
$v_1 + v_2 (n-1) p^*$, which implies that under this assumption (3) yields:

$$c = \lim_{n \to \infty} \frac{v_1 + v_2 (n-1) p^*(n)}{(n-1) p^*(n)} \tag{4}$$

If $\liminf_{n \to \infty} p^*(n) n = 0$, and this limit is realized along some subsequence $n_k$, then looking at
(4) along this subsequence yields $c = v_1/0$, which is a contradiction. If $\limsup_{n \to \infty} p^*(n) n = \infty$,
and this limit is achieved along some subsequence $n_k$, then (4) along this subsequence yields
$c = v_2$, which does not hold for any of our cases.

This leaves the case where all the partial limits of $p^*(n) n$ are strictly positive, finite numbers.
Let $d$ be such a partial limit, which is achieved along a subsequence $n_k$. Taking the limit of
(4) along this subsequence yields:

$$c = \frac{v_1 + v_2 d}{d}$$

which gives

$$d = \frac{v_1}{c - v_2}$$

Since the partial limit is pinned down by this equation, it must be that it is the true limit of
$p^*(n) n$. Additionally, for this limit to make sense we must have that $c > v_2$.

Summing up, if $\limsup_{n \to \infty} n^{1/2} p^*(n) = 0$, then, given $v_2 \neq c$, we must have that $v_2 < c$ and that
$\lim_{n \to \infty} p^*(n) n = v_1/(c - v_2)$.

2. $\liminf_{n \to \infty} n^{1/2} p^*(n) = \infty$.

Under this assumption we have that:

$$\lim_{n \to \infty} \bigl(1 - (p^*)^2\bigr)^{n-3} = 0$$

and thus, in the limit, the dominant term in the numerator of the RHS of (3) is $v_1 - v_2$,
while the denominator converges to infinity. This implies that taking $n \to \infty$ in (3) yields
$c = 0$, a contradiction.

3. Every partial limit of $p^*(n) n^{1/2}$ is a finite, positive number.

In this case, take some subsequence $n_k$ such that along it we have $p^*(n_k) n_k^{1/2} \to d$, for some
positive number $d$. Taking a partial limit of (3) along this subsequence yields:

$$c = \frac{v_2\, d\, e^{-d^2}}{d} = v_2 e^{-d^2}$$

Solving for $d$, we have that:

$$d = \Bigl(\log \frac{v_2}{c}\Bigr)^{1/2}$$

This implies that there is a unique possible partial limit for $p^*(n) n^{1/2}$, and thus this sequence
converges to $(\log(v_2/c))^{1/2}$. For this to be well defined, we must have $v_2 > c$.

Since these three cases exhaust all the possibilities, this completes the proof. ∎
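
The two limits obtained in the proof can be checked against a numerical solution of the FOC for a large but finite population. The following is a minimal sketch, assuming the rewritten FOC with $(1-p)(1-p^2)^{n-3}$ in the numerator; the function name `equilibrium_p`, the bisection solver, and the parameter values are illustrative assumptions. Bisection relies on the right-hand side being decreasing in $p$, which the proof of Theorem 1 establishes when $v_1 \ge v_2$.

```python
import math

def equilibrium_p(n, c, v1, v2, iters=200):
    """Solve c = RHS(p) of the equilibrium FOC by bisection on p in (0, 1)."""
    def rhs(p):
        # (1 - p^2)^(n-3), computed in logs for numerical stability
        power = math.exp((n - 3) * math.log1p(-p * p))
        numerator = (v1 - v2) + v2 * (1 - p) * power * (1 + (n - 1) * p)
        return numerator / ((n - 1) * p)

    lo, hi = 1e-15, 1 - 1e-12
    if rhs(hi) > c:
        return 1.0  # FOC never binds: corner solution
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rhs(mid) > c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    n, c = 10**6, 0.5

    # Low-intensity regime (v2 < c): n * p* should approach v1 / (c - v2).
    v1, v2 = 1.0, 0.2
    print((n - 1) * equilibrium_p(n, c, v1, v2), v1 / (c - v2))

    # High-intensity regime (v2 > c): sqrt(n) * p* should approach sqrt(log(v2 / c)).
    v1, v2 = 2.0, 1.0
    print(math.sqrt(n) * equilibrium_p(n, c, v1, v2), math.sqrt(math.log(v2 / c)))
```

Repeating the calculation for increasing $n$ shows the finite-population values moving toward the two limits, with the high-intensity limit approached more slowly because the $v_1 - v_2$ term vanishes only at rate $n^{-1/2}$.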

Proof of Proposition 2. These results are standard results in random graph theory. See
Jackson (2008), Theorem 4.1 for the results on connectedness and Sections 4.2.4 and 4.2.5 for the
results on the size and existence of the giant component.

For the diameter of the connected network in the high-intensity regime, Corollary 10.12(i) in
Bollobas (2001) gives the result directly. ∎



Proof of Theorem 3. The proof follows the same arguments, word for word, as the proof of
Theorem 2 applied to the homogeneous agent case, but for the FOC of the social planner instead
of that of the individual agent. This FOC is derived by differentiating the agents' utility function
with respect to $p$, given that all realization probabilities are $p$, and is explicitly given by:

$$c = \frac{v_1 - v_2 + (1 - p)^{n-2}(1 + p)^{n-3}\bigl(1 + (2n - 3)p\bigr) v_2}{(n-1)p}$$

The claim follows. ∎



Proof of Theorem 4. The proof follows the standard Kakutani fixed-point approach to showing
the existence of a symmetric equilibrium when utilities are concave and the choice set of each agent
is convex. The only delicate point is that the strategy profile where all agents choose intensity 0 is
an equilibrium, and so one must show that there is an equilibrium in addition to this. This is done
by restricting agents to play intensities strictly greater than $\epsilon$ and showing the constraint does not
bind. The details will be completed in the final draft. ∎

Proof of Theorem [5] Let us first derive the FOC for agent i with private cost coefficient Ch- 
Denote by x* h the equilibrium strategy that an agent with cost coefficient Ch uses in equilibrium. 
Let us denote by {X^}]^^ independent random variables which take on value x* h with a probability 
p(c/i). In other words, Xk expresses the intensity of agent k from the perspective of agent i, in 
equilibrium. Agent i's maximizes: 



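$$
\begin{aligned}
&\mathbb{E}\Bigl[\sum_{k \neq i} v_1 x X_k + v_2 \sum_{k \neq i} (1 - xX_k)\Bigl(1 - \prod_{l \neq i,k}\bigl(1 - xX_l^2X_k\bigr)\Bigr)\Bigr] - \frac{c_h}{2}\,\mathbb{E}\Bigl[\Bigl(\sum_{k \neq i} xX_k\Bigr)^2\Bigr] \\
&\quad= (n-1)v_1x\,\mathbb{E}[X_1] + v_2(n-1)\,\mathbb{E}\Bigl[(1 - xX_1)\Bigl(1 - \prod_{l \neq i,1}\bigl(1 - xX_l^2X_1\bigr)\Bigr)\Bigr] - \frac{c_hx^2}{2}\,\mathbb{E}\Bigl[\Bigl(\sum_{k \neq i} X_k\Bigr)^2\Bigr] \\
&\quad= (n-1)v_1x\,\mathbb{E}[X_1] + v_2(n-1)\,\mathbb{E}\Bigl[(1 - xX_1)\Bigl(1 - \bigl(1 - x\mathbb{E}[X_1^2]X_1\bigr)^{n-2}\Bigr)\Bigr] - \frac{c_hx^2}{2}\,\mathbb{E}\Bigl[\Bigl(\sum_{k \neq i} X_k\Bigr)^2\Bigr]
\end{aligned}
$$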



by the law of iterated expectations and the fact that the $X_k$ are i.i.d. random variables. When taking FOCs, we can legitimately differentiate under the expectation sign, as the distribution over costs is finite. This yields:



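$$
c_h x_h^*\bigl(\mathbb{E}[X_1^2] + (n-2)\mathbb{E}[X_1]^2\bigr) = (v_1 - v_2)\mathbb{E}[X_1] + v_2\,\mathbb{E}\Bigl[X_1\bigl(1 - x_h^*\mathbb{E}[X_1^2]X_1\bigr)^{n-2} + (1 - x_h^*X_1)(n-2)X_1\mathbb{E}[X_1^2]\bigl(1 - x_h^*\mathbb{E}[X_1^2]X_1\bigr)^{n-3}\Bigr]
$$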



which can be rewritten as: 



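$$
c_h = \frac{(v_1 - v_2)\mathbb{E}[X_1] + v_2\,\mathbb{E}\Bigl[X_1\bigl(1 - x_h^*\mathbb{E}[X_1^2]X_1\bigr)^{n-3}\bigl(1 + (n-2)\mathbb{E}[X_1^2] - (n-1)x_h^*X_1\mathbb{E}[X_1^2]\bigr)\Bigr]}{x_h^*\bigl(\mathbb{E}[X_1^2] + (n-2)\mathbb{E}[X_1]^2\bigr)} \tag{5}
$$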
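The numerator of (5) can be checked numerically. The sketch below assumes a finite type distribution (intensities `xs` with probabilities `ps`, both hypothetical inputs) and that agent $i$'s expected benefit is $(n-1)\bigl(v_1x\mathbb{E}[X_1] + v_2\mathbb{E}[(1-xX_1)(1-(1-x\mathbb{E}[X_1^2]X_1)^{n-2})]\bigr)$. It compares a finite-difference derivative of that benefit in $x$ with $(n-1)$ times the numerator of (5).

```python
def moments(xs, ps):
    """Return E[X_1] and E[X_1^2] for a finite type distribution."""
    m1 = sum(p * t for p, t in zip(ps, xs))
    m2 = sum(p * t * t for p, t in zip(ps, xs))
    return m1, m2


def expected_benefit(x, xs, ps, n, v1, v2):
    """Benefit of an agent playing x while others draw intensities from (xs, ps)."""
    m1, m2 = moments(xs, ps)
    fof = sum(
        p * (1 - x * t) * (1 - (1 - x * m2 * t) ** (n - 2))
        for p, t in zip(ps, xs)
    )
    return (n - 1) * (v1 * x * m1 + v2 * fof)


def foc_numerator(x, xs, ps, n, v1, v2):
    """Numerator of the right-hand side of the FOC, evaluated at own intensity x."""
    m1, m2 = moments(xs, ps)
    inner = sum(
        p * t * (1 - x * m2 * t) ** (n - 3)
        * (1 + (n - 2) * m2 - (n - 1) * x * t * m2)
        for p, t in zip(ps, xs)
    )
    return (v1 - v2) * m1 + v2 * inner


def derivative(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)
```

The two quantities agree up to discretization error for any interior $x$, which is what the factorization step from the first form of the FOC to (5) requires.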

We first claim that the equilibrium $x_h^*$ goes to 0 as $n$ grows, for all possible costs $c_h \in C$. In particular this implies that from some $n$ onwards the FOC holds with equality for $x_h^*$ for all $h$. Assume otherwise. This means that the FOC holds with LHS $\le$ RHS for some $c_h$ and a sequence $n_k \to \infty$, such that for any $k$ it holds that $x_h^* > \varepsilon > 0$. This implies a contradiction, since in (5) the denominator of the RHS of the FOC grows to infinity:

$$
\lim_{k \to \infty} x_h^*\bigl(\mathbb{E}[X_1^2] + (n_k - 2)\mathbb{E}[X_1]^2\bigr) > \lim_{k \to \infty} \varepsilon^3 p_h^2 (n_k - 2) = \infty
$$

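while the numerator is bounded:

$$
\begin{aligned}
&\lim_{k \to \infty}\Bigl\{(v_1 - v_2)\mathbb{E}[X_1] + v_2\,\mathbb{E}\Bigl[X_1\bigl(1 - x_h^*\mathbb{E}[X_1^2]X_1\bigr)^{n_k-3}\bigl(1 + (n_k-2)\mathbb{E}[X_1^2] - (n_k-1)x_h^*X_1\mathbb{E}[X_1^2]\bigr)\Bigr]\Bigr\} \\
&\quad\le \lim_{k \to \infty}\Bigl\{(v_1 - v_2)\mathbb{E}[X_1] + v_2\,\mathbb{E}\Bigl[X_1(1 - \varepsilon)^{n_k-3}\bigl(1 + (n_k-2)\mathbb{E}[X_1^2]\bigr)\Bigr]\Bigr\} \\
&\quad\le \lim_{k \to \infty}(v_1 - v_2)\mathbb{E}[X_1] < \infty
\end{aligned}
$$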

Assume next that $\limsup_{n \to \infty} \max_{1 \le h \le m} n^{1/4}x_h^* = 0$. This implies that for any $h$:

$$
\lim_{k \to \infty}\bigl(1 - x_h^*\mathbb{E}[X_1^2]X_1\bigr)^{n_k-3} = 1
$$

and thus the dominant term in the numerator of the RHS of (5) is $v_1\mathbb{E}[X_1] + v_2(n-2)\mathbb{E}[X_1]\mathbb{E}[X_1^2]$, while the dominant term in the denominator is $x_h^*(n-2)\mathbb{E}[X_1]^2$, giving us that:



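$$
c_h = \lim_{n \to \infty}\frac{v_1 + v_2(n-2)\mathbb{E}[X_1^2]}{x_h^*(n-2)\mathbb{E}[X_1]} \tag{6}
$$

which also implies that for any $h_1, h_2$:

$$
\frac{c_{h_1}}{c_{h_2}} = \lim_{n \to \infty}\frac{x_{h_2}^*}{x_{h_1}^*}
$$

so the equilibrium intensities converge together.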
Consider three cases: 



1. If $\liminf_{n \to \infty} x_h^* n^{1/2} = 0$ along a subsequence $n_k$, then looking at (6) along this subsequence yields $c_h = v_1/0$, which is a contradiction.

2. If $\limsup_{n \to \infty} x_h^* n^{1/2} = \infty$ along a subsequence $n_k$, then looking at (6) along this subsequence yields:

$$
c_h = \lim_{k \to \infty}\frac{v_2\mathbb{E}[X_1^2]}{x_h^*\mathbb{E}[X_1]}
$$

Inverting this formula, and writing in terms of sociability instead of cost, yields:

$$
s_h = \lim_{k \to \infty}\frac{x_h^*\mathbb{E}[X_1]}{v_2\mathbb{E}[X_1^2]}, \qquad s_h^2 = \lim_{k \to \infty}\frac{(x_h^*)^2\mathbb{E}[X_1]^2}{v_2^2\mathbb{E}[X_1^2]^2}
$$

Taking expectation with respect to $p$ of both terms and dividing the first by the second yields:

$$
v_2 = \frac{\mathbb{E}[S]}{\mathbb{E}[S^2]}
$$

which does not hold for any of our cases.



3. Assume that for some $h$ some partial limit along a subsequence satisfies $\lim_{k \to \infty} x_h^* n_k^{1/2} = d_h > 0$. Because the $x_h^*$ converge together, we have that for any $h'$ there exists some $d_{h'}$ such that $\lim_{k \to \infty} x_{h'}^* n_k^{1/2} = d_{h'} > 0$. (6) thus becomes, with slight abuse of notation:

$$
c_h = \frac{v_1 + v_2\mathbb{E}[d^2]}{d_h\mathbb{E}[d]} \tag{7}
$$

which also implies that for any $h, h'$ it holds that $d_{h'} = d_h c_h / c_{h'}$. Plugging the expressions for the other $h'$ back into (7) yields:

$$
c_h = \frac{v_1 + v_2 d_h^2 c_h^2\mathbb{E}[S^2]}{d_h^2 c_h\mathbb{E}[S]}
$$

which is solved by:

$$
d_h = s_h\sqrt{\frac{v_1}{\mathbb{E}[S] - v_2\mathbb{E}[S^2]}}
$$

Since the limit is pinned down, it is the true limit of the sequence $x_h^* n^{1/2}$. If $v_2 > \mathbb{E}[S]/\mathbb{E}[S^2]$, then we have a contradiction.
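The closed form in case 3 can be verified against (7) directly. The following is a minimal sketch that assumes sociability is the reciprocal of the cost coefficient, $s_h = 1/c_h$ (consistent with $d_{h'} = d_h c_h/c_{h'}$); the cost values and probabilities passed in are hypothetical.

```python
import math


def low_intensity_limits(costs, probs, v1, v2):
    """Candidate limits d_h = s_h * sqrt(v1 / (E[S] - v2 * E[S^2])), assuming s_h = 1 / c_h."""
    s = [1.0 / c for c in costs]
    es = sum(p * si for p, si in zip(probs, s))        # E[S]
    es2 = sum(p * si * si for p, si in zip(probs, s))  # E[S^2]
    gap = es - v2 * es2
    if gap <= 0:
        # The square root is real only in the low-intensity regime.
        raise ValueError("requires v2 < E[S] / E[S^2]")
    scale = math.sqrt(v1 / gap)
    return [si * scale for si in s]


def limit_equation_residuals(costs, probs, v1, v2):
    """Residuals c_h - (v1 + v2 * E[d^2]) / (d_h * E[d]) of the limiting FOC."""
    d = low_intensity_limits(costs, probs, v1, v2)
    ed = sum(p * di for p, di in zip(probs, d))
    ed2 = sum(p * di * di for p, di in zip(probs, d))
    return [c - (v1 + v2 * ed2) / (di * ed) for c, di in zip(costs, d)]
```

When $v_2 \ge \mathbb{E}[S]/\mathbb{E}[S^2]$ the function raises an error, mirroring the contradiction noted in the text.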

The above shows that if $\limsup_{n \to \infty} \max_{1 \le h \le m} n^{1/4}x_h^* = 0$, then $v_2 < \mathbb{E}[S]/\mathbb{E}[S^2]$ and the limits in part (1) of the theorem hold.
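To illustrate why $n^{1/4}$ is the relevant scaling, consider a hypothetical homogeneous population in which every agent plays $x = d\,n^{-\alpha}$, so that $x_h^*\mathbb{E}[X_1^2]X_1 = x^4$. The sketch below evaluates $(1 - x^4)^{n-3}$, which tends to 1 when $\alpha > 1/4$, to $e^{-d^4}$ when $\alpha = 1/4$, and to 0 when $\alpha < 1/4$. The function name and the parameters `d` and `alpha` are illustrative assumptions.

```python
import math


def no_common_friend_factor(n, d, alpha):
    """(1 - x**4) ** (n - 3) for x = d * n ** (-alpha), computed stably via log1p."""
    x = d * n ** (-alpha)
    return math.exp((n - 3) * math.log1p(-x ** 4))
```

For example, with `n = 10**8` and `d = 1.0`, the factor is close to 1 for `alpha = 0.5`, close to `math.exp(-1)` for `alpha = 0.25`, and negligible for `alpha = 0.2`.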

Next, assume that $\liminf_{n \to \infty} \min_{1 \le h \le m} n^{1/4}x_h^* = \infty$ along some subsequence $n_k$. This implies that:

$$
\lim_{k \to \infty}\bigl(1 - x_h^*\mathbb{E}[X_1^2]X_1\bigr)^{n_k-3} = 0
$$

and thus the dominant term in the numerator of the RHS of (5) is $(v_1 - v_2)\mathbb{E}[X_1]$, while the dominant term in the denominator is $x_h^*(n_k - 2)\mathbb{E}[X_1]^2$, giving us that:

$$
c_h = \lim_{k \to \infty}\frac{v_1 - v_2}{(n_k - 2)x_h^*\mathbb{E}[X_1]} = 0
$$

by the assumption that $n_k^{1/4}x_h^* \to \infty$ for every $h$. This is a contradiction.

Thus, the only remaining case is that for every $h$ and every subsequence $n_k \to \infty$ we have that $\lim_k x_h^* n_k^{1/4}$ converges to a finite positive number, $d_h$. Assuming this is the case, the FOC in (5) becomes:

$$
c_h = \frac{v_2\mathbb{E}[d^2]\,\mathbb{E}\bigl[d e^{-d_h d\,\mathbb{E}[d^2]}\bigr]}{d_h\mathbb{E}[d]^2} \tag{8}
$$



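where the expectations are taken over $d$ as a random variable which assumes value $d_h$ with probability $p(c_h)$. Inverting this equation and taking expectation with respect to $p$ yields:

$$
\mathbb{E}[S] = \mathbb{E}\left[\frac{d'\,\mathbb{E}[d]^2}{v_2\mathbb{E}[d^2]\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]}\right], \qquad
\mathbb{E}[S^2] = \mathbb{E}\left[\frac{(d')^2\,\mathbb{E}[d]^4}{v_2^2\mathbb{E}[d^2]^2\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]^2}\right]
$$

where the outer expectation is taken over $d'$, a random variable which is distributed identically to $d$. Taking the ratio of these two expressions yields:

$$
\frac{\mathbb{E}[S]}{\mathbb{E}[S^2]} = v_2 \cdot \frac{\mathbb{E}\left[\dfrac{d'\,\mathbb{E}[d]^2}{\mathbb{E}[d^2]\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]}\right]}{\mathbb{E}\left[\dfrac{(d')^2\,\mathbb{E}[d]^4}{\mathbb{E}[d^2]^2\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]^2}\right]} \tag{9}
$$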



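We will make use of the following lemma.

Lemma 1.

$$
\mathbb{E}\left[\frac{d'\,\mathbb{E}[d]^2}{\mathbb{E}[d^2]\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]}\right] < \mathbb{E}\left[\frac{(d')^2\,\mathbb{E}[d]^4}{\mathbb{E}[d^2]^2\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]^2}\right]
$$

By Lemma 1 and (9) we have that:

$$
\frac{\mathbb{E}[S]}{\mathbb{E}[S^2]} < v_2
$$

and thus when $\lim_k x_h^* n_k^{1/4}$ converges to a positive number, we must be in the high-intensity regime.

This completes the proof. ∎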



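Proof of Lemma 1. It holds that:

$$
\mathbb{E}\left[\frac{d'\,\mathbb{E}[d]^2}{\mathbb{E}[d^2]\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]}\right] < \mathbb{E}\left[\frac{(d')^2\,\mathbb{E}[d]^4}{\mathbb{E}[d^2]^2\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]^2}\right]
$$

if and only if:

$$
\mathbb{E}\left[\frac{d'}{\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]} - \frac{(d')^2\,\mathbb{E}[d]^2}{\mathbb{E}[d^2]\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]^2}\right] < 0
$$

Since $e^{-d'd\,\mathbb{E}[d^2]} < 1$ as $d, d'$ are strictly positive random variables, it holds that:

$$
\mathbb{E}\left[\frac{d'}{\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]} - \frac{(d')^2\,\mathbb{E}[d]^2}{\mathbb{E}[d^2]\,\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]^2}\right] < \mathbb{E}\left[\frac{1}{\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]}\left(d' - \frac{(d')^2\,\mathbb{E}[d]}{\mathbb{E}[d^2]}\right)\right] \tag{10}
$$

$d' - (d')^2\,\mathbb{E}[d]/\mathbb{E}[d^2]$ is a random variable with expectation 0, which is positive for values of $d'$ satisfying $0 < d' < \mathbb{E}[d^2]/\mathbb{E}[d]$, and negative for higher values. $1/\mathbb{E}\bigl[d e^{-d'd\,\mathbb{E}[d^2]}\bigr]$ is a positive function in $d'$ which is strictly increasing. Thus, the left-hand side of (10) must be strictly negative, which completes the proof.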
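The inequality in Lemma 1 can be spot-checked numerically for a finite distribution of $d$. The sketch below computes both sides for hypothetical support points `values` with probabilities `probs`; it is an illustration of the statement under those assumed inputs, not a substitute for the proof.

```python
import math


def lemma1_sides(values, probs):
    """Return (lhs, rhs) of the Lemma 1 inequality for a finite distribution of d."""
    ed = sum(p * d for p, d in zip(probs, values))        # E[d]
    ed2 = sum(p * d * d for p, d in zip(probs, values))   # E[d^2]

    def inner(d_prime):
        # E[d * exp(-d' * d * E[d^2])], expectation over d with d' held fixed.
        return sum(p * d * math.exp(-d_prime * d * ed2) for p, d in zip(probs, values))

    lhs = sum(p * dp * ed ** 2 / (ed2 * inner(dp)) for p, dp in zip(probs, values))
    rhs = sum(
        p * dp ** 2 * ed ** 4 / (ed2 ** 2 * inner(dp) ** 2)
        for p, dp in zip(probs, values)
    )
    return lhs, rhs
```

In the degenerate case where $d$ equals 1 with probability one, the two sides reduce to $e$ and $e^2$, so the inequality is strict even without heterogeneity.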



Proof of Proposition 3. The fact that $c < v_1 - v_2$ is sufficient for unilateral stability is obvious. Now suppose that $c > v_1 - v_2$. We will show that the probability that some agent wants to sever a link is bounded away from 0 for all $n$. It will suffice to this end to show that the probability of an isolated triangle occurring in the network is bounded away from 0. An isolated triangle is a triple of agents $\{i, j, k\}$ with links $ij$, $jk$, and $ik$, and no links to anyone else. It is clear that if $c > v_1 - v_2$, then any agent in an isolated triangle wants to sever a link. The probability that three given vertices $\{i, j, k\}$ form an isolated triangle is at least

$$
\underline{p}^3\,(1 - \overline{p})^{3(n-2)} \tag{11}
$$

where $\underline{p}$ is the minimum probability of a link between two vertices and $\overline{p}$ is the maximum probability of such a link. In (11), the first factor is a lower bound on the probability that the three requisite links exist, and the second factor is a lower bound on the probability that the agents are not linked to anyone else. Now, there are $\binom{n}{3}$ possible triples of agents, so (by linearity of expectation) the expected number of isolated triangles is at least $\binom{n}{3}\underline{p}^3(1 - \overline{p})^{3(n-2)}$. Since both $\underline{p}$ and $\overline{p}$ behave as $n^{-1}$ by Theorem [HJ this expectation is bounded away from 0, as $n \to \infty$. ∎
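The last step can be illustrated numerically. The sketch below assumes, purely for illustration, that the extreme link probabilities are exactly $\underline{p} = a/n$ and $\overline{p} = b/n$ for hypothetical constants $0 < a \le b$; under that assumption the lower bound on the expected number of isolated triangles converges to $a^3 e^{-3b}/6 > 0$.

```python
import math


def triangle_lower_bound(n, a, b):
    """C(n, 3) * p_lo**3 * (1 - p_hi)**(3 * (n - 2)) with p_lo = a / n and p_hi = b / n."""
    p_lo, p_hi = a / n, b / n
    return math.comb(n, 3) * p_lo ** 3 * (1 - p_hi) ** (3 * (n - 2))


def limiting_lower_bound(a, b):
    """Limit of triangle_lower_bound as n grows: a**3 * exp(-3 * b) / 6."""
    return a ** 3 * math.exp(-3 * b) / 6
```

Because the limit is strictly positive for any such constants, the bound does not vanish as the network grows.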